The  Director's  Corner

Steve  Adamec,  NAVO  MSRC  Director 

Looking  Forward  and  Back  at 
the  NAVO  MSRC 


This year marks the tenth anniversary of the establishment of the Primary Oceanographic Prediction System (POPS) supercomputer center here at the Naval Oceanographic Office.

The POPS center, the forerunner of the present NAVO MSRC, initially offered its user community a CRAY Y-MP/8 system with 2.7 gigaflops of peak computing capability. Significantly, it was established to serve both research and development (R&D) and operational high performance computing (HPC) requirements within the Navy simultaneously. Over the past ten years, the center has increased its computing capacity 1000-fold while continuing to serve both the Department of Defense (DoD) R&D and Navy operational HPC needs. The focus on combined R&D and operational HPC processing within one center has yielded significant benefits to DoD, including high systems availability, resilient networking and storage infrastructure, and dramatically improved scheduling of the largest HPC applications. These applications include those associated with the DoD High Performance Computing Modernization Program (HPCMP) challenge projects and the time-critical, global-scale HPC applications for the operational Navy community, which must run multiple times every day of the year.

As we cruise into summer, the preparations for UGC 2001 are almost complete. This year's conference promises to be a good one, affording us the opportunity to extend some Gulf Coast hospitality to a large contingent of the DoD user community. The Shared Resource Combined Advisory Panel (SRCAP) has done an outstanding job of organizing this year's event, and we are privileged to assist them in bringing it to fruition. We look forward to seeing old and new friends in Biloxi this coming June.

Finally,  we  bid  farewell  to  Mr.  Terry  Blanchard, 
who  retired  in  March  2001  after  more  than 
thirty  years  of  distinguished  Federal  service. 
Terry's  contributions  to  the  DoD  HPCMP  as 
both  Deputy  Director  and  Director  of  the 
NAVO  MSRC  were  numerous  and  substantial, 
enabling  this  MSRC  to  establish,  sustain,  and 
enhance  a  premier  HPC  capability  for  the
DoD  user  community. 
The  Naval  Oceanographic  Office  (NAVO)
Major  Shared  Resource  Center  (MSRC): 
Delivering  Science  to  the  Warfighter 

The NAVO MSRC provides Department of Defense (DoD) scientists and engineers with high performance computing (HPC) resources, including leading edge computational systems, large-scale data storage and archiving, scientific visualization resources and training, and expertise in specific computational technology areas (CTAs). These CTAs include Computational Fluid Dynamics (CFD), Climate/Weather/Ocean Modeling and Simulation (CWO), Environmental Quality Modeling and Simulation (EQM), Computational Electromagnetics and Acoustics (CEA), and Signal/Image Processing (SIP).
Scalable  Flow  Simulations  with  Rotating
Components 

Roger  Briley,  Mechanical  Engineering  Department 

Lafayette  K.  Taylor  and  David  L.  Whitfield,  Aerospace  Engineering  Department, 
Computational  Simulation  and  Design  Center,  ERC,  Mississippi  State  University 
http://www.erc.msstate.edu/simcenter 


Scalable parallel computing is greatly advancing the complexity of problems for which analysis and design, based on large-scale complex flow simulations, are becoming feasible. The Computational Simulation and Design Center (SimCenter) at Mississippi State University's Engineering Research Center (ERC) has developed scalable flow simulation software for both multiblock structured grids that have arbitrary block connectivity and for multielement unstructured grids. These flow solvers are capable of high-resolution simulations for very large Reynolds numbers (i.e., on the order of $10^9$).


Two current Office of Naval Research (ONR)-sponsored Department of Defense (DoD) Challenge projects with allocations at the Naval Oceanographic Office (NAVO) MSRC and United States (U.S.) Army Engineer Research and Development Center (ERDC) Major Shared Resource Center (MSRC) are using these codes to study unsteady viscous flow phenomena associated with underwater vehicles and surface ships.


Figures 1a (left) and 1b (below). Scalability Properties from a Semiempirical Performance Model


One project, led by Dr. L. Patrick Purtell, ONR, focuses on submerged wakes in littoral regions and continues a previous Challenge project at the Arctic Region Supercomputing Center (ARSC) on submarine maneuvering. The second project, led by Dr. Ki-Han Kim, ONR, concerns surface ship maneuvering and sea keeping. Additionally, the National Aeronautics and Space Administration (NASA) Ames sponsors simulations of tilt-rotor aircraft flows and maneuvering in a study that is related to these projects through a cooperative agreement with the Navy.


All these projects require scalable parallel supercomputing to address the requirements of large-scale unsteady viscous-flow simulations past complex geometries with dynamic and rotating components.

The parallel algorithms now being used (references 1 and 2) have evolved over the past ten years from previous serial algorithms for the unsteady Reynolds-averaged Navier-Stokes equations. They combine multiple-iteration implicit schemes, characteristic-based finite-volume spatial approximations, and numerical flux linearizations with Block-Jacobi Gauss-Seidel relaxation for the innermost iteration to provide scalable concurrency. The semiempirical performance model cited in reference 1 is used to illustrate some of the scalability properties of the structured-grid flow solver for very large-scale problems, and some examples of complex unsteady flow simulations with rotating components are given from recent work of SimCenter researchers.

Semiempirical Model for CPU, Memory, and Cost Efficiencies


A semiempirical performance model has been developed (reference 1) to study the scalability of the parallel solution algorithms as actually implemented for existing and hypothetical computing platforms using Message Passing Interface (MPI).

The parallel algorithms, spatial domain decomposition, and message-passing software framework were specifically designed to provide scalability for complex flow simulations on modern distributed memory architectures. The codes have operated efficiently on T3E, Origin2/3K, Sun Enterprise, SP2/3, and Unix/Linux clusters. The actual Central Processing Unit (CPU) times and communications overhead are routinely measured, and observed efficiencies have been consistent with this model for cases run on numerous machines, currently up to 11 million points and 100 processors.

Figure 2. Propelled Notional Submarine in Straight-Ahead Motion

SPRING 2001

NAVO MSRC NAVIGATOR


For present purposes, the parallel CPU efficiency $\eta_{CPU} = T_{CPU}/T_{runtime}$ is defined to be the ratio of the total time spent in CPU operations to the total run time, including message-passing communications. By definition, the communications overhead is $1 - \eta_{CPU}$. The memory efficiency $\eta_{Mem} = Mb_{Used}/Mb_{Reserved}$ is the ratio of processor memory utilized during execution to the total memory reserved.

The total memory includes idle processor memory not used during execution, adjusted for any shared memory actually allocated to other users. An unused memory overhead could be defined as $1 - \eta_{Mem}$. Finally, by assuming that hardware costs are apportioned as 50% CPU, 30% memory, and 20% supporting hardware, a cost efficiency for hardware resource utilization can be defined as $\eta_{Cost} = 50\%\,\eta_{CPU} + 30\%\,\eta_{Mem} + 20\%$. Although hardware costs obviously vary, these assumed percentages at least approximate current market pricing, and other reasonable estimates would not significantly alter the predicted trends.

The performance model (reference 1) estimates CPU and message-passing times based on architectural parameters for each specific machine, including a measured effective CPU mega floating-point operations per second (Mflop) rate (as compiled), a rate for loading and unloading of message buffer arrays, MPI software bandwidth and latency, and the number of processors. Solution algorithm parameters are also used, including number of grid points and subiteration cycles, floating-point operation count, and number and average length of messages. Although the rotating interface conditions are implemented in a scalable form, the performance model does not yet include messages for rotating components or free surfaces.

Performance Model Results

According to the performance model, the optimal cost efficiency for distributed-memory computers is obtained by choosing the minimum number of processors required to provide the necessary global memory, since this gives 100% memory efficiency and a small communication/computation ratio. If necessary, the run time can be reduced by increasing the number of processors, although with reduced memory efficiency and increased communications overhead.

Figure  3.  Rising  Maneuver  Induced 
by  Sailplane  Motion 


The model estimates for CPU, cost, and memory efficiencies are shown in Figures 1a and 1b for a modern but generic computer having the following parameters: effective CPU (100 Mflops) and buffering (30 Mb/s) rates, MPI bandwidth (130 Mb/s) and latency (15 µs), and memory of 512 Mb per processor.

These efficiencies are shown for both memory-constrained sizeup, in which the problem size is increased to maintain $\eta_{Mem} \approx 100\%$ as processors are added, and a constant-problem-size scaleup for 10 million grid points.

As expected, the CPU efficiency is higher for memory-constrained sizeup, but the difference in cost efficiency is more dramatic due to the rapid drop in memory efficiency for constant problem size.
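
As a rough illustration of why cost efficiency falls faster than CPU efficiency at constant problem size, the Python sketch below evaluates the cost-efficiency definition given earlier (50% CPU, 30% memory, 20% supporting hardware). The function name and the sample efficiency values are hypothetical and chosen only to show the trend; they are not outputs of the performance model.

```python
def cost_efficiency(eta_cpu, eta_mem, w_cpu=0.50, w_mem=0.30, w_other=0.20):
    """Hardware cost efficiency as a weighted sum of CPU and memory efficiency.

    The default weights follow the assumed 50% CPU / 30% memory / 20%
    supporting-hardware cost split; supporting hardware counts as fully used.
    """
    for name, value in (("eta_cpu", eta_cpu), ("eta_mem", eta_mem)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {value}")
    return w_cpu * eta_cpu + w_mem * eta_mem + w_other


# Hypothetical inputs (assumptions for illustration, not model results).
# Memory-constrained sizeup: memory stays fully used as processors are added.
sizeup = cost_efficiency(eta_cpu=0.90, eta_mem=1.00)
# Constant problem size on four times as many processors: each processor
# holds a quarter of the data, so memory efficiency drops to about 25%.
fixed_size = cost_efficiency(eta_cpu=0.80, eta_mem=0.25)

print(f"sizeup cost efficiency:     {sizeup:.3f}")
print(f"fixed-size cost efficiency: {fixed_size:.3f}")
```

With these assumed inputs, a ten-point difference in CPU efficiency becomes a much larger gap in cost efficiency, because the idle memory is charged at 30% of the hardware cost.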

The most important computer parameters for scalability are effective CPU rate as timed for the executable operating with message passing suppressed, the MPI software bandwidth for large messages, and the time required to load message-passing buffer arrays. The MPI latency is negligible since there are only a small number of large messages.

Overall,  the  performance  model  indicates 
that  the  method  is  scalable  in  a  practical 
sense  for  large-scale  problems. 

Recent  Unsteady  Simulations 

Figure 4. Solution with Free Surface for a Notional SWATH Hull Design Concept

A number of large-scale simulations for complex geometries involving rotating components have been performed during the past two years. Figure 2 gives an example of a structured-grid solution for a propelled submarine configuration (SUBOFF) in straight-ahead motion, with a Reynolds number of 12 million. Figure 3 shows a rising maneuver induced by a prescribed motion of the sailplane control surfaces. Shown are the trajectory in submarine lengths, a closeup view of surface pressure near the sailplane, and axial velocity contours revealing tip vortices behind the deflected sailplanes. This case has 4.5 million points, and each hull length traveled (2200 time steps) requires 53 hours on 50 T3E processors (3.5 gigaflops (Gflops), $\eta_{CPU} = 87\%$). Figure 4 provides another structured-grid example for a surface ship solution with free surface.
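
To put the quoted run time in perspective, the short Python calculation below converts the figures from the rising-maneuver case (4.5 million points, 2200 time steps, 53 hours, 50 processors, 3.5 Gflops) into per-step and per-point costs. The variable names are illustrative, and the per-point figure assumes the work is spread evenly over points and steps, which the article does not state.

```python
points = 4.5e6            # grid points in the rising-maneuver case
steps_per_length = 2200   # time steps per hull length traveled
wall_hours = 53.0         # wall-clock hours per hull length
processors = 50           # T3E processors used
sustained_gflops = 3.5    # aggregate sustained rate quoted in the text

# Total processor time consumed per hull length traveled.
proc_hours_per_length = wall_hours * processors

# Wall-clock time for a single time step.
wall_seconds_per_step = wall_hours * 3600.0 / steps_per_length

# Processor time per grid point per time step, in microseconds
# (assumes an even distribution of work, an illustrative simplification).
cpu_us_per_point_step = (proc_hours_per_length * 3600.0
                         / (steps_per_length * points) * 1e6)

# Average sustained rate per processor.
mflops_per_processor = sustained_gflops * 1000.0 / processors

print(f"processor-hours per hull length: {proc_hours_per_length:.0f}")
print(f"wall seconds per time step:      {wall_seconds_per_step:.1f}")
print(f"CPU microseconds per point-step: {cpu_us_per_point_step:.0f}")
print(f"Mflops per processor:            {mflops_per_processor:.0f}")
```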

Figure 5 shows an unstructured-grid solution for a notional submarine in straight-ahead motion at full-scale Reynolds number (i.e., $10^9$). The sublayer resolution for this and other solutions here is such that y+ < 1.0 at all surface points.


Figure 5. Notional Submarine at Full-Scale Reynolds Number ($10^9$)


Figure 6 shows a second unstructured-grid example for a Model 5415 destroyer hull that includes both rotating propellers and nonlinear free-surface conditions. This case, at Fr = 0.28, is especially difficult because of the wetted transom stern.

Figure  7  shows  the  computed  maneuvering  trajectory 
and  axial  velocity  contours  at  two  instants  in  time  for  a 
notional  tilt-rotor  aircraft  solution,  obtained  from  a 
combined  unstructured  grid  simulation  and  6DOF 
analysis.  Figures  8a  and  8b  show  a  visualization  of  vortex 
concentrations  behind  a  P5168  marine  propeller. 


Figure  6.  Model  5415  Destroyer  Hull  with  Nonlinear 
Free-Surface  Conditions 
Finally, Figure 9 gives a validation comparison of both structured and unstructured solutions with experimental measurements for the P5168 propeller.

Figure 7. Computed Trajectory and Flow for a Tiltrotor Aircraft Maneuver Induced by a Sudden Wind Gust Following Ice Buildup on Wing Surfaces

Figures 8a and 8b. Vortex Feature Detection in a Computed Flow for a P5168 Propeller

Figure 9. Validation of Structured and Unstructured Solutions with Experimental Measurements for a P5168 Propeller (Axial, Radial, and Circumferential Velocity)

A high-resolution, time-accurate solution for a structured grid of one million points requires 2.1 Gigabit (Gb) of memory and about 0.25 processor hours per time step on a Sun ULTRA10000. The unstructured-grid code requires the same 2.1 Gb of memory, and although it requires more than double the run time per grid point, comparable viscous resolution has been achieved with 2-5 times fewer grid points with unstructured grids by exploiting local control of point distributions.
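
The trade-off in the last sentence can be made concrete with a small Python sketch. It assumes the unstructured code costs exactly twice as much per grid point (the text says only "more than double", so this is a lower bound) and that run time scales linearly with the number of points; the function name is hypothetical.

```python
def relative_unstructured_cost(time_per_point_ratio, point_reduction):
    """Unstructured run time per step relative to the structured-grid run.

    time_per_point_ratio: unstructured/structured run time per grid point.
    point_reduction: factor by which the unstructured grid has fewer points.
    Assumes run time is proportional to the number of grid points.
    """
    if time_per_point_ratio <= 0 or point_reduction <= 0:
        raise ValueError("both factors must be positive")
    return time_per_point_ratio / point_reduction


# Assumed per-point cost ratio of 2.0 (a lower bound on "more than double").
for reduction in (2, 3, 5):
    ratio = relative_unstructured_cost(2.0, reduction)
    print(f"{reduction}x fewer points -> {ratio:.2f}x the structured run time")
```

Under these assumptions the unstructured solver breaks even at a twofold point reduction and runs in well under half the time at a fivefold reduction.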
Nanotubes

J. Bernholc, Department of Physics, North Carolina State University, Raleigh, NC


into each other, and the nanotube structure is uniquely defined by the coordinates of the smallest folding vector (n,m) in the basis of lattice vectors a and b. The (n,0) zigzag and (n,n) armchair tubes are mirror-symmetric; all other tubes are chiral, i.e., the hexagon bands wind around the nanotube with a non-zero pitch.


Carbon is unique among elements in its ability to assume a wide variety of different structures and forms. About fifteen years ago a new family of carbon cage structures, all based on a threefold coordinated sp² network, was discovered. This discovery inaugurated the science of fullerenes.

Of  these,  C60  is  the  most  abundant 
and  perhaps  the  best-known  member. 

However, perhaps the most exciting among the recent additions to the fullerene family are carbon nanotubes, discovered soon after C60 was made in quantity. Carbon nanotubes are hollow cylinders consisting of "rolled-up" graphitic sheets, as illustrated in Figures 1a and 1b. They are believed to have extraordinary structural, mechanical, and electrical properties that derive from the special properties of carbon bonds, their unique quasi-one-dimensional nature, and their cylindrical symmetry. For instance, the graphitic network upon which the nanotube structure is based is well known for its strength and elasticity, thereby providing for unmatched mechanical strength.

Nanotubes can also be metallic or semiconducting, depending on their chirality (see Figure 1a). This opens up the very interesting prospects of junctions and devices made entirely of carbon.
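
The dependence on chirality can be sketched in a few lines of Python. The folding vector (n,m) and the zigzag, armchair, and chiral families are as defined in the Figure 1a caption. The rule that a tube is metallic when n - m is divisible by 3, the diameter formula, and the lattice constant of about 0.246 nm are standard textbook results from the simplest band-structure picture; they are not stated in this article and are used here as assumptions. The function name is illustrative.

```python
import math

A_LATTICE_NM = 0.246  # graphene lattice constant in nm (approximate, assumed)


def classify_nanotube(n, m):
    """Return (family, is_metallic, diameter_nm) for folding vector (n, m).

    Uses the textbook zone-folding rule: metallic if (n - m) % 3 == 0.
    Curvature effects, which open small gaps in some tubes, are ignored.
    """
    if n < 0 or m < 0 or (n == 0 and m == 0):
        raise ValueError("n and m must be non-negative and not both zero")
    if m > n:
        n, m = m, n  # (n, m) and (m, n) are mirror images of each other
    if m == 0:
        family = "zigzag"
    elif n == m:
        family = "armchair"
    else:
        family = "chiral"
    is_metallic = (n - m) % 3 == 0
    diameter_nm = A_LATTICE_NM * math.sqrt(n * n + n * m + m * m) / math.pi
    return family, is_metallic, diameter_nm


for indices in [(10, 10), (9, 0), (10, 0), (8, 4)]:
    family, metallic, d = classify_nanotube(*indices)
    kind = "metallic" if metallic else "semiconducting"
    print(f"{indices}: {family}, {kind}, diameter {d:.2f} nm")
```

In this simple picture every armchair tube is metallic, while zigzag and chiral tubes can be either, which is why junctions between tubes of different (n,m) are of interest for all-carbon devices.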

The  strongest  materials  known 

Nanotubes  have  very  special 
mechanical  properties.  Their  aspect 
ratios  are  enormous,  with  currently 
manufactured  nanotubes  having 
widths  of  1-2  nanometers  and 
lengths  ranging  from  a  fraction  of  a 
micron  to  a  fraction  of  a  millimeter. 

They can be thought of as fibers, which could be employed to strengthen composite materials, or used directly, when methods to grow long tubes in quantity are developed.

Figure 1b. Nanotube obtained by rolling a graphitic sheet into a cylinder.

Computer simulations, which have been confirmed by careful nanoscale experiments, have shown that nanotubes are extremely flexible; they can bend reversibly to very high angles without exhibiting any damage even at an atomic scale. Supercomputer simulations have also predicted the immense strength of pure nanotubes, more than ten times the strength of steel at one-sixth the weight. Recent calculations by Qingzhong Zhao and Marco Buongiorno Nardelli at North Carolina State University suggest that the effective strength of nanotubes could even be significantly greater than that, because of the large "activation barriers" for atom rotation (see Figure 2), which must be overcome during breakage.

Single-walled nanotubes (consisting of a single cylinder) like to form bundles, or "ropes," while multiwalled nanotubes are made up of a number of concentric cylinders which do not necessarily have the same helicity, or pitch. Depending on the helicity, the nanotubes can be conducting, semiconducting, or insulating.

Thus,  they  are  excellent  candidates  for  multifunctional 
materials,  which  provide  both  enormous  strength  as 
well  as  electrically  conducting  or  insulating  properties, 
as  needed.  However,  controlled  growth  of  nanotubes 
with  desired  length  and  pitch  is  still  some  time  off. 


Figure 1a. Nanotube structures are obtained by rolling a graphitic sheet into a cylinder (see Figure 1b). The points O and O' in the graphite sheet fold



When this is accomplished, even more interesting applications become possible, since calculations have shown that while some (armchair) nanotubes are conducting even when severely bent, other (chiral) metallic nanotubes lose their conductivity during bending, providing a nanoscale strain sensor.

Nanotube-based  devices 

Nanotubes are excellent building blocks for nanoscale electronic devices. Due to their small dimensions and essentially perfect structure, a variety of novel devices become possible, including nano-electromechanical systems (NEMS), efficient electron emitters for flat panel displays and vacuum electronics, nanoscale chemical sensors, actuators, and even single-electron transistors. Some of these devices have already been realized experimentally, but many obstacles to their use still remain.

In most cases, the overriding issue is controlled fabrication, but for quantum devices the underlying limitations must also be explored. One potential limitation is the huge megaohm resistance of nanotube-metal contacts, which is 100-1000 times more than expected. Such resistance could be due to poor fabrication in the experimentally very difficult nano-regime, but it could also have a fundamental physical origin.

Our group has thus embarked on a comprehensive study of nanotube-metal contacts by developing complex quantum-mechanical methods, which can compute the quantum transport properties of electrons of a nanotube-metal contact coupled to an external circuit. The required calculations are very demanding computationally, but the techniques developed as part of the Multiscale Simulations of Nanotubes and Quantum Structures project enable effective parallelization and therefore massively parallel execution.

The results of the first such calculation are shown in Figure 3, which depicts the electron distribution in a nanotube-aluminum contact and the transfer of electron charge between the two systems.

A sophisticated analysis of the results shows that the high resistance has a fundamental cause, namely a "weak coupling" or a lack of common electron conductance channels between the perfect nanotube and the metal. However, there are several ways in which the coupling might be enhanced, including mechanical pressure on the contact region. The contact could thus be part of a nanoscale pressure sensor, and several other device configurations are possible.

Our current work focuses on more in-depth investigations of potential nanotube-based devices, and we are collaborating with Department of Defense-sponsored experimentalists at the University of North Carolina at Chapel Hill. Apart from NEMS structures, we are also evaluating nanotubes for battery applications.

Early indications suggest that Li-nanotube batteries will have higher capacities and higher discharge and recharge rates than those based on graphite, but a lot of experimental and theoretical research still needs to be carried out.
Figure  2.  Quantum  molecular  dynamics  simulations
show  that  nanotubes  initiate  breakage  by  a  bond 
rotation,  where  a  pair  of  atoms  rotates  about  the 
center  of  their  bond  and  converts  four  hexagons 
(highlighted  in  red)  into  a  5-7-7-5  defect.  The  barrier 
for  this  rotation  is  very  high,  which  further  increases 
the  exceptional  strength  of  nanotubes. 


Figure  3.  The  top  panel  shows  the  distribution  of  electrons  in  a  carbon  nanotube 
deposited  on  the  surface  of  aluminum.  The  lower  panel  shows  the  charge  transfer, 
namely  the  electrons  that  left  the  nanotube  (purple)  and  entered  the  metal  (blue). 
The  calculations  were  performed  by  Marco  Buongiorno  Nardelli  and  Jean-Luc 
Fattebert. 
OpenMP  Parallelization  of  a  3-D  Finite  Element
Circulation  Model 

Dr.  Timothy  J.  Campbell,  NAVO  MSRC  Programming  Environment  &  Training 
Dr.  Cheryl  Ann  Blain,  Oceanography  Division,  Naval  Research  Laboratory 


The Naval Oceanographic Office (NAVO) Major Shared Resource Center (MSRC) Programming Environment and Training (PET) program offers Department of Defense (DoD) researchers and engineers the opportunity to work in close collaboration with PET analysts to bring about serial and parallel optimizations to their applications.

One such collaboration occurred in a project that successfully ported an advanced three-dimensional (3-D) finite element (FE) circulation model to shared memory parallel machines.

The goal of this project was to produce, inasmuch as possible, a scalable code that required no change to the user interface and configuration files, while at the same time educating the researchers in parallel programming techniques.

OpenMP  multithreading  directives 
were  chosen  to  port  the  model  as 
they  can  provide  a  minimally  intrusive  and  incremental 
method  for  producing  a  parallel  code. 

Physical  &  Mathematical  Model 

Parallelization efforts were focused on the Dartmouth College circulation model, QUODDY, which represents the most physically advanced finite element model to date.

This model is a time-marching simulator based on the 3-D hydrodynamic equations subject to the conventional Boussinesq and hydrostatic assumptions. A wave-continuity form of the mass conservation equation, designed to eliminate numerical noise at or below two times the grid spacing, is solved in conjunction with momentum conservation and transport equations for temperature and salinity.

Vertical mixing is represented with a level 2.5 turbulence closure. This turbulence closure scheme accounts for processes occurring over the vertical extent of the water column, such as diffusion, shear production, buoyancy production, and dissipation. Variable horizontal resolution is provided on unstructured triangular meshes. A general terrain-following vertical coordinate allows smooth resolution of surface and bottom boundary layers.


The QUODDY model is dynamically equivalent to the often used Princeton Ocean Model. The advantage of the current model lies in its finite element formulation, which allows for greater flexibility in representing geometric complexity and strong horizontal gradients in bathymetry and/or velocity.

Parallel  Implementation 

OpenMP is a parallel programming model for shared memory and distributed shared memory multiprocessors that works with either standard Fortran or C/C++.

OpenMP consists of compiler directives, which take the form of source code comments, that describe the parallelism in the source code. A supporting library of subroutines is also available to applications. The OpenMP specification and related material can be found at the OpenMP web site: http://www.openmp.org.

Online training in OpenMP is part of the NAVO MSRC PET distance learning (http://www.navo.hpc.mil/pet/Video), and links to other online training material can be found at the NAVO PET Parallel Computing Portal (http://www.navo.hpc.mil/Tools/pcomp.html).

In Fortran, OpenMP compiler directives are structured as comments, written as C$OMP or !$OMP. An OpenMP program begins as a single process, called the master thread. When a parallel region, which is preceded by either a parallel or parallel-do construct, is encountered, threads are forked to execute the statements enclosed within the parallel construct.

At the end of the parallel region, the threads synchronize, and only the master thread remains to continue execution of the program. The parallel-do construct is commonly discussed and provides a convenient and incremental way to parallelize computationally intensive loops within a program.

The downside to this approach is that the creation of threads at the beginning and their subsequent destruction at the end of the parallel-do construct can require a large number of cycles. The developer must be sure that the loop being parallelized has enough computational work to make the overhead due to the OpenMP constructs worthwhile.
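
One simple way to reason about whether a loop has "enough" work is a break-even estimate. The Python sketch below assumes an idealized cost model that is not taken from the article: a perfectly balanced loop whose parallel time is the serial time divided by the thread count, plus a fixed fork/join overhead. The function names and the sample overhead value are hypothetical.

```python
def parallel_loop_time(serial_time, threads, overhead):
    """Idealized time for a parallelized loop: even split plus fixed overhead."""
    if threads < 1:
        raise ValueError("threads must be at least 1")
    return serial_time / threads + overhead


def min_worthwhile_serial_time(threads, overhead):
    """Serial loop time above which the parallel version is faster.

    From serial_time / threads + overhead < serial_time it follows that
    serial_time > overhead * threads / (threads - 1).
    """
    if threads < 2:
        raise ValueError("need at least 2 threads for any gain")
    return overhead * threads / (threads - 1)


# Hypothetical fork/join overhead of 50 microseconds per parallel loop.
overhead_us = 50.0
for threads in (2, 8, 64):
    threshold = min_worthwhile_serial_time(threads, overhead_us)
    print(f"{threads:2d} threads: loop must take more than {threshold:.1f} us serially")
```

In this model the threshold is never smaller than the overhead itself, so loops that run for less time than the fork/join cost are better left serial.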

The approach used in this project is in the spirit of the Single Program Multiple Data (SPMD) model which is common in Message Passing Interface (MPI) programming. The parallel/end parallel directives were used to enclose the entire time-stepping portion of the code, including subprogram calls within the parallel execution region. Work decomposition within the parallel region is based on the horizontal mesh.

During execution in the parallel region, the threads remain in existence, and proper data flow is ensured through minimal use of the barrier synchronization construct. Also, code that must be executed in serial is handled by the master thread.

Since the barrier construct can be 30 to 50 percent less expensive than a parallel do, this approach significantly reduces the amount of overhead associated with OpenMP.
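
The control flow of this SPMD-style approach can be sketched with Python's standard threading module. This is only an analogy and not the QUODDY code or OpenMP itself: the threads are created once, each owns a contiguous block of mesh nodes, barriers separate the phases of each time step, and thread 0 plays the role of the master for the serial section. The block partition, the function names, and the trivial per-node update are all assumptions made for illustration, and Python threads will not show a real speed-up because of the interpreter lock.

```python
import threading


def block_range(n_items, n_threads, tid):
    """Contiguous [start, stop) slice of n_items owned by thread tid."""
    base, extra = divmod(n_items, n_threads)
    start = tid * base + min(tid, extra)
    stop = start + base + (1 if tid < extra else 0)
    return start, stop


def run_spmd(n_nodes, n_threads, n_steps):
    values = [0.0] * n_nodes   # shared array, one entry per horizontal node
    totals = []                # appended to by the master thread only
    barrier = threading.Barrier(n_threads)

    def worker(tid):
        start, stop = block_range(n_nodes, n_threads, tid)
        for _ in range(n_steps):        # threads persist across all time steps
            for i in range(start, stop):
                values[i] += 1.0        # stand-in for the per-node update
            barrier.wait()              # every update done before the serial part
            if tid == 0:                # serial section handled by the master
                totals.append(sum(values))
            barrier.wait()              # master finished before the next step

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return totals


print(run_spmd(n_nodes=10, n_threads=3, n_steps=4))  # [10.0, 20.0, 30.0, 40.0]
```

The threads are forked and joined once for the whole run, so the only recurring cost per time step is the pair of barriers.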

The QUODDY software model consists of four sets of programs and includes files for the dimensioning of variables. Parallelization work focuses on three of the program sets that consist of main, core, and fixed routines.

When a user applies the QUODDY application to a particular regional model, these three sets of programs remain unmodified. The fourth program set consists of user-built subroutines that are built with a standardized interface.

These  routines  are  used  to  specify  things  such  as  physical 
forcing,  vertical  meshing,  boundary  conditions,  and  the 
manner  in  which  results  are  to  be  analyzed  and  written. 

By restricting the OpenMP code changes to the main, core, and fixed routines, the user is able to seamlessly apply the parallel QUODDY to different regional models. The user need only compile with the subroutines defined for the regional model of choice.

Verification  &  Proper  Performance 

Correctness of the parallel code execution has been verified through direct comparison with the original serial code execution for the Yellow Sea Regional Model (6847 horizontal × 21 vertical nodes) (reference 1). This verification was done using the full "seasonal" mode in which wind is applied and temperature and salinity are transported prognostically.

Since the user-defined output data was of limited precision, verification was done by directly comparing (at full precision) all time-integrated variables. Possible race conditions were "flushed out" by running with the number of threads greater than the number of processors. An exact match between the serial and parallel execution has been achieved.

Performance measurements were done using the Arabian Gulf regional model (17440 horizontal nodes and either 21 or 51 vertical nodes).2 The speed-up on p processors is defined as the single processor execution time divided by the time for execution on p processors. Figure 1 shows the speed-up achieved for the OpenMP version of QUODDY on the NAVO MSRC Sun E10000 (64 processors with 64 Gigabyte (GB) shared memory).

Two vertical grid resolutions (21 and 51) were measured. The increase in vertical grid resolution provides more work per horizontal node, thus increasing the scalability of the code. The overall scalability of the OpenMP QUODDY is limited by the remaining serial portions of work (about 5 percent, handled by the master thread) and the synchronization overhead.
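
The limit imposed by the serial portion can be illustrated with Amdahl's law, a standard model that is not part of the original QUODDY analysis. The sketch below assumes a fixed serial fraction of 5 percent (the approximate figure quoted above) and ignores synchronization overhead entirely, so it gives an optimistic upper bound rather than the measured curve in Figure 1. The function name amdahl_speedup is hypothetical.

```python
def amdahl_speedup(serial_fraction, processors):
    """Upper bound on speed-up when a fixed fraction of the work stays serial."""
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be between 0 and 1")
    if processors < 1:
        raise ValueError("processors must be at least 1")
    parallel_fraction = 1.0 - serial_fraction
    return 1.0 / (serial_fraction + parallel_fraction / processors)


if __name__ == "__main__":
    # 5 percent serial work is the approximate figure quoted in the text.
    for p in (1, 2, 4, 8, 16, 32, 64):
        bound = amdahl_speedup(0.05, p)
        print(f"{p:2d} processors: speed-up <= {bound:5.2f}, "
              f"efficiency <= {bound / p:4.0%}")
```

Under this model the bound can never exceed 1/0.05 = 20, however many processors are added, which is one way to read the flattening of a speed-up curve at high processor counts.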

Impact  &  Application 

The state-of-the-art QUODDY 3D FE model is a principal tool in the NRL Arabian Gulf project, whose primary objective is development of a circulation model for the Arabian Gulf and connecting waters that realistically predicts the complex 3-D circulation and mixing patterns in the region over seasonal, tidal, sub-tidal, and storm event time scales. Mesh resolution is variable, approximately 3 kilometers (km) for depths less than 40 meters (m) and 6 km elsewhere out to 200-m depth in the Gulf of Oman.

Article  Continues  Page  14... 


Figure 1. Speed-up of OpenMP QUODDY4 for the Arabian Gulf (17440 horizontal nodes) on the NAVO Sun E10000. Results for two vertical mesh resolutions are shown: 21 vertical nodes (blue-filled squares) and 51 vertical nodes (red-filled circles).


NAVO  MSRC  NAVIGATOR 


SPRING  2001 


11 


Researchers at the Naval Postgraduate School, Monterey, California, in conjunction with the NAVO MSRC Visualization Center staff, have undertaken a long-term project to improve the performance of the Multiblock Grid Princeton Ocean Model (MGPOM). The use of multiblock grids in the development of ocean models facilitates domain composition and varying grid resolutions to provide the ability to concentrate grid resolution in the dynamic near-shore regions and save resolution in the less dynamic deep-ocean areas.

Traditional one-block rectangular grids (286 x 286, 4 arc minute resolution), while invaluable, consume large quantities of wall time, slowing research and raising costs. For example, a traditional single block, serial (vector) code takes approximately 1,116 minutes (18.6 hours) of wall time to complete a 10-day simulation. In comparison, a 29-block grid with the same resolution, using MPI-Pthreads MGPOM code, takes only 27 minutes of wall time.

The model produced with this new and improved code provides three-dimensional (3-D) temperature, salinity, and circulation (currents) data as shown in Figures 1 through 5. These images represent screen captures of an analysis environment built for these researchers by the NAVO MSRC Visualization Center staff. This application, and others developed by the Visualization Center staff, provides researchers with a portable analysis environment for ocean model output that supports a variety of functions for both the military and civilian communities.
The model is designed with modular dynamics in which certain mechanisms, such as heat flux, wind forcing, stratification, tides, or river inflow can be independently included or excluded from model equations. This modularity is used to examine the contributions of each component to the overall circulation dynamics.

Simulations are forced by seasonal hydrography, seasonal winds, and tides. For each month, the initial temperature and salinity fields prognostically evolve subject to tidal rectification and a constant wind stress.

The summer mean circulation is primarily driven by the baroclinic pressure gradient. Fresh water entering the Arabian Gulf from the Gulf of Oman at the surface, coupled with strong evaporation in the north, creates a cyclonic circulation gyre that runs the length of the basin. The northwesterly wind strengthens southward flow along the western edge of the gyre. A westward component of the wind in the southern Arabian Gulf pushes water across the very shallow shelf of the United Arab Emirates (UAE) coast and out through the Strait of Hormuz (Figure 2a).

During winter, the strong northwest winds (3 times the magnitudes in summer) set up southeastward flowing coastal currents in the northern Gulf along each shoreline. The winds also impede penetration of the freshwater into the Gulf and greatly reduce the strength of the counterclockwise (CCW) circulation along the axis of the basin (Figure 2b).

In fact, the winds push the circulation gyre to the south and toward the center of the Gulf. Since there is no westerly component to the wind in winter, the circulation on the shallow southern shelf is quite complex and varied from that seen in summer.

Figure 2b. Simulated winter seasonal circulation in the Arabian Gulf. Stream function (color) and depth-averaged currents (vectors).

The speed-up achieved by the OpenMP version of QUODDY is immediately useful to the Arabian Gulf and other planned modeling work. Before the port, the QUODDY model would not execute properly on the MSRC resources, restricting the researchers to performing simulations only on their workstations.

Performing 10-model-day seasonal simulation experiments required up to several days of execution with limited vertical grid resolution. Now, on 8 processors of the NAVO MSRC Sun E10000, researchers can perform the same 10-model-day seasonal simulation with increased vertical grid resolution in just over 6 hours (with 71 percent parallel efficiency).
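
The relationship between the quoted parallel efficiency and the implied speed-up can be worked through directly. The sketch below uses the standard definitions (speed-up is the single-processor time divided by the p-processor time, and efficiency is speed-up divided by p). The 6.0-hour run time is a rounded stand-in for "just over 6 hours", so the derived single-processor time is only an estimate, and the function names are hypothetical.

```python
def speedup_from_efficiency(efficiency, processors):
    """Speed-up implied by a parallel efficiency on a given processor count."""
    return efficiency * processors


def single_processor_hours(parallel_hours, efficiency, processors):
    """Estimated one-processor run time for the same problem."""
    return parallel_hours * speedup_from_efficiency(efficiency, processors)


if __name__ == "__main__":
    p = 8          # processors used on the Sun E10000
    eff = 0.71     # parallel efficiency quoted in the text
    hours = 6.0    # assumption: "just over 6 hours" rounded down
    print(f"speed-up on {p} processors: {speedup_from_efficiency(eff, p):.2f}")
    print(f"estimated single-processor time: "
          f"{single_processor_hours(hours, eff, p):.1f} hours")
```

On these assumptions the speed-up is about 5.7, and the same run would need roughly 34 hours on a single E10000 processor.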

The  reduced  turnaround  time  will  greatly  accelerate 
the  model  development  process.  Additionally,  because 
this  was  a  collaborative  effort,  the  researchers  are  now 
familiar  with  the  OpenMP  code  changes  and  are  able 
to  modify  and  improve  the  parallel  code. 


Figure  2a.  Simulated  summer  seasonal  circulation  in  the  Arabian 
Gulf.  Stream  function  (color)  and  depth-averaged  currents  (vectors). 
Visualization Center Video Production Studio Ready to
Take on 21st Century in Style

Kerry Townson,  Multimedia  Specialist 


With an eye to the future and an emphasis on flexibility, the Visualization Center Video Production Studio (VPS) was recently upgraded to better fulfill the audio-visual requirements of the NAVO MSRC.

Created eight years ago to produce videotaped programs about the MSRC and to support its research activities, the VPS now boasts two separate nonlinear editing systems, extensive graphics capabilities, and the ability to quickly produce multimedia products for a variety of uses. Besides the traditional videotape output, the studio can produce materials for multimedia presentations on CD-ROM, webcasting, and Digital Video Discs (DVD).

Going Digital

Like many technical fields, video production has moved to the desktop. The original design of the VPS was similar to a broadcast facility, with videotape machines feeding a rack full of discrete components. Each piece of equipment handled one particular task such as video titling, digital special effects, audio production, or automated editing. Routing tape machines to different pieces of equipment required attaching cables by hand or using an awkward patchbay device. The system's complexities often left the video editor wishing for extra hands and reduced the efficiency and expediency of project completion.

Advances in technology have combined most of the tasks into a desktop computer environment that enables an operator to manage several tape decks, special effects, and audio with a few clicks of a mouse button. A sophisticated routing system controlled by a touch panel allows the signals of any tape deck, monitor, or editing system to be fed to any other device with just three taps on the screen.

Digital video has brought the ability to work entirely within the computer, with pristine quality and amazing special effects that were impossible only a few years ago without hundreds of thousands of dollars worth of broadcast equipment. Video can now be fed to the system either in digital format directly from the NAVO MSRC supercomputers or encoded from analog videotapes. Stored in digital format on the computer's hard drives, the video can be fed back to tape machines for duplication, saved to digital files, and distributed via disk, CD-ROM, DVD, or written to a network computer for future use.

Above: The main rack contains six video sources, a dubbing station, routing hardware, and the base station for one of the nonlinear editors. The studio can accommodate eight different videotape formats and a variety of digital files.

Left: The main console houses two nonlinear editing systems and their host computers, audio processing and monitoring equipment, user-assignable video monitors, and a touch panel to perform signal routing tasks.

Right: Audio processing equipment includes Digital Audio Tape (DAT), CD-ROM, multiple audio effects units, and a computer-based Digital Audio Workstation.

The Hardware

At the heart of the system is a computer-based routing system that handles all of the video, audio, synchronization, and control signals from every component. A touch panel makes it simple to connect any machine's outputs to another's inputs, or to a monitor or editing system.

The VPS now has two nonlinear editing (NLE) systems. The Trinity NLE is used primarily for high-quality video. It features all-digital editing of extremely high-resolution images with no loss of quality during the editing process, as well as a collection of eye-popping special effects. Video clips are dragged onto a timeline and combined with graphics, special effects, and sound to create polished, professional programs on par with those seen on commercial television.

A second NLE is dedicated to multimedia applications. Based on a Matrox RT2000 card installed in a standard PC, this system is used for video projects that will be inserted into CD-ROMs or PowerPoint presentations.

The VPS can also create video footage for its clients. A professional 3-chip digital camcorder is available for shooting on location. A sound booth, 16-channel audio mixer, dual audio effects units, CD-ROM player, and a digital audio tape (DAT) recorder round out the audio capabilities of the new system.

Applications 

The VPS provides a wide variety of services to the NAVO MSRC and its clients. The VPS has provided short video clips and voiceovers for use in two CD-ROMs distributed at supercomputing conventions and in PowerPoint presentations.

Above: A dedicated sound booth provides the ability to record high-quality narration for multimedia projects.

Visualization Center animators use the facility to assemble high-definition renderings of computer data generated by the supercomputers of the MSRC. A network interface allows single frames of animations to be fed to the VPS, assembled into a video program, and recorded onto disk or videotape. Real-time capture of supercomputer displays permits the transfer of interactive simulations such as Theater High Altitude Area Defense (THAAD) and Miami Isopycnic Computer Ocean Model (MICOM) to videotape.

Other projects include a video introduction to the NAVO MSRC for the numerous grade-school classes who visit the facility each year and a more detailed documentary of the center's activities aimed at adult visitors. VPS projects are viewed daily on large-format displays in the Stennis Space Center Visitors Center and outside the NAVO MSRC Visualization Center.

Left: Electronic News Gathering (ENG) equipment adds on-location videotaping to the studio's capabilities.

With its state-of-the-art equipment and enhanced capabilities, the VPS staff looks forward to providing a new level of service to the NAVO MSRC and its clients needing audio-visual support.
31 - Final Tutorials Handouts Due
September 7 -
NAVO MSRC PET Update

Eleanor  Schroeder,  NAVO  MSRC  Programming 
Environment  and  Training  Program  (PET)  Government  Lead 


The  end  of  PET  as  we  all  know  it  is 
approaching,  and  we  are  spinning  up 
an  exciting  all-new  PET. 

As  the  PET  program  evolved  over  the 
past  five  years,  it  took  on  a  more 
global,  more  user-oriented,  focus. 
While  the  original  intent  was  to 
improve  the  productivity  of  users  at 
the  Major  Shared  Resource  Centers 
(MSRCs),  the  new  PET  will  expand  to 
incorporate  users  located  at  the 
Distributed  Centers  and  Department 
of  Defense  (DoD)  remote  locations. 

There will still be four PET components, co-located at each MSRC. Each component is now responsible for specific functional areas as designated in the box below.

We have grouped these designated functional areas to encourage synergy among related Computational Technology Areas (CTAs), collaboration and interaction between CTA and cross-community functions, and to balance workloads. There are a total of fifteen functional areas, ten of which support the ten established CTAs. The other five are:

Collaborative  and  Distance 
Learning  Technologies 

This functional area encompasses not only virtual meetings (meetings without travel), but also technology for on-line training, consultation, information, and tutorials. The activities within this function are expected to interact with our training content providers for the development, testing, and deployment of distance learning technology and course material. We expect strong interaction with the Defense Research and Engineering Network (DREN) initiative to ensure coordination and incorporation of collaborative and distance learning technology into the High Performance Computing Modernization Program (HPCMP) networking and security infrastructure.

Four PET Components

Component 1 (NAVO): CWO/EQM; Computational Environment

Component 2 (ASC): FMS/IMT/SIP; Enabling Technologies

Component 3 (ERDC): CFD/CSM; PET Online Knowledge Center; Education, Outreach, and Training Coordination

Component 4 (ARL): CCM/CEA/CEN; Collaborative and Distance Learning Technologies

Computational  Environment 

Critical to easy and effective use of DoD High Performance Computing (HPC) systems resources, from the high performance computer down to the desktop, is improving the usability of computational environments at the DoD Shared Resource Centers (SRCs).

These computational environments encompass all aspects of the user's interface to HPC resources, including programming environments (e.g., debuggers, libraries, solvers, higher order languages, and performance analysis, prediction, and optimization tools), computing platforms (e.g., common queuing, clusters, distributed data, and metacomputing), reusable parallel algorithms, and user access tools (e.g., portals and web-based access to HPC resources).

Enabling  Technologies 

This functional area involves advancing the state of tools, algorithms, and standards for generalized run-time and pre- and post-processing analysis on enormous datasets. At a minimum, this function will entail visualization, data mining and knowledge discovery, image analysis, grid generation, problem-solving environments, and computational techniques and methods for the intelligent extraction of useful information from data.

PET  Online  Knowledge  Center  (OKC) 

The OKC will provide repositories for PET programmatic information and technical knowledge in both the computational science and computational technology areas. It will also provide ready access to software tools and products, as well as current information on PET projects in all functional areas. Additionally, the PET OKC will allow HPCMP users and personnel to enter a single Web portal with one navigational hierarchy, information strategy, and search mechanism, to better allow them to distinguish vast amounts of information and expertise from distributed sites.

Education,  Outreach,  and 
Training  Coordination 

Under this functional area we will address the efficient and productive delivery of instructional content to the DoD HPC user as well as opportunities for Minority Serving Institutions (MSIs), undergraduate, graduate, and postdoctoral students, and visiting scientist/engineer appointments. Also, this is where we will tend to the training of future DoD HPC users.

The education of both novice and experienced HPC users in new and innovative technologies is an essential element of this functional area. While instructional content and delivery technologies are addressed in other PET functional areas, activities in this functional area will include coordination of on-site training at the SRCs and remote sites, selection of optimal training delivery methods and media, and coordination of outreach forums.

At NAVO MSRC, we look forward to this new version of PET. We believe it will bring us many new, exciting challenges and provide us with closer ties to the DoD HPC user community in the coming years.

On a personal note: As we close the door on the last year of PET as we know it, I would like to take a moment to thank the many academics who have worked with, and continue to work with, our NAVO MSRC team. Without your hard work and efforts, we certainly would not have had as successful a program as we did. I hope that our paths will cross again during this new evolution of PET.
PET Training - Practical and Hands On

Dr. Timothy J. Campbell, NAVO MSRC PET


OpenMP Language and KAP
Pro Tools for OpenMP "Bring
Your Own Code" Workshop

In October and November 2000, the Naval Oceanographic Office (NAVO) Major Shared Resource Center (MSRC) Programming Environment and Training Program (PET) held a four-day "Bring Your Own Code" workshop on the OpenMP language and the KAP Pro Toolset. Seventeen people attended the workshop in the NAVO MSRC PET classroom facilities, where all code development and application runs were performed on the NAVO MSRC Sun E10000.

The workshop was designed for experienced Fortran 77/90/95 programmers who have used serial platforms ranging from workstations to mainframes. However, no prior knowledge of programming parallel computers was assumed. While Shared Memory Parallel (SMP) platforms were the workshop target, special attention was devoted to issues related to porting legacy code to SMP OpenMP implementations.

The first two days consisted of an extensive OpenMP language course complete with hands-on exercises. On the third day, students received training on the KAP Pro Toolset for OpenMP provided by an instructor from Intel. Students were invited to "bring your own code" on the fourth day in order to receive direct assistance from the instructors in parallelizing their code using OpenMP and the KAP Pro Toolset.

The student-provided codes represented Climate/Weather/Ocean modeling (CWO) applications ranging from sediment transport, to acoustics, to wave modeling. In a short amount of time several students were able to parallelize computationally intensive loops within their applications and achieve a speed-up of about a factor of 2 on multiple processors. Typically, students were able to capture about 50 to 60% of the computation in parallel with about one to two hours of work.

NAVO MSRC PET would be pleased to repeat this workshop in the future. If you would like to attend an OpenMP or parallelization workshop in the future, please contact the NAVO MSRC PET Training Coordinator, Brian Tabor, at taborb@navo.hpc.mil.

IBM  ACTC  Applications  on 
the  IBM  SP2  "Bring  Your 
Own  Code"  Workshop 

In support of user transition to the NAVO MSRC 1,336-processor IBM SP2, NAVO MSRC PET hosted a "Bring Your Own Code" workshop in December 2000. Experienced instructors from the IBM Advanced Computing Technology Center guided 12 attendees through the workshop.

The three-day workshop provided an opportunity for NAVO MSRC users to learn about developing and running applications on the IBM SP. A detailed introduction was given on hybrid distributed/shared cache-based parallel processors with a focus on the IBM Power 3 Winterhawk and Nighthawk nodes. Programming techniques for optimal uni-processor performance were presented, including cache utilization, stride elimination, and prefetching. Useful performance analysis and debugging tools were discussed and demonstrated. The morning sessions of days two and three covered programming techniques such as Pthreads, OpenMP, and Message Passing Interface (MPI) for shared and distributed memory parallelization.

Students were invited to bring their own codes for the day two and three afternoon sessions, which were devoted to the conversion and optimization of student codes. The instructors provided direct assistance to students from several Challenge and non-Challenge projects who brought codes representing the Computational Fluid Dynamics (CFD), Climate/Weather/Ocean Modeling (CWO), Environmental Quality Modeling (EQM), and Computational Chemistry and Materials Science (CCM) Computational Technology Areas (CTAs). For all of the attendees, the "bring your own code" session was time well spent.

One student was able to resolve several debugging issues in a hybrid MPI/OpenMP wave modeling application. Another student made significant progress in porting a CFD application to MPI.

If you are interested in more information about the NAVO IBM SP, visit http://www.navo.hpc.mil/usersupport/IBM.

2001  Winter  Applied 
MetaComputing/University  of 
Virginia  Legion  Workshop 

The joint Legion Group/Applied Metacomputing Winter 2001 Workshop was held in January at the University of Virginia. The workshop targeted all levels of users and administrators, and included information on customizing and troubleshooting Legion systems. Participants were introduced to the Legion system, philosophy, and architecture and were given an in-depth user's point of view. Hands-on sessions gave users the opportunity to adapt and run either their own or a test application in Legion, while system administrators had the opportunity to explore advanced administration topics such as building a system, adding resources to an existing system, and managing security.

For more information on the Legion workshops, visit http://www.legion.virginia.edu/workshops.html.
Navigator Tools and Tips

So  You  Were  Given  Hours  to  Run  Your  Model 
on  Something  Called  the  SV1 

Ray  Sheppard,  NAVO  MSRC  User  Support 


That machine has four host names: Zeus, Poseidon, Trident, and Athena. How do you choose which "machine" to run on? You've got a 500-MB input data file and don't have the time to wait on pulling it off tape from the mass storage after your job starts running, so you decide to load it into your /tmp directory first. You put it in /tmp/my_log_name on Athena and submitted your job, but got an error file that says "no such file or directory." The job tried to run on Zeus! How do you stop that?

Once a job is submitted to the queue, normal users have no control over where it will execute. At the moment, this is only a minor inconvenience because, with only four nodes (machines), your 500-MB data file could be copied into four different /tmp directories, and you would be good to go. However, this machine has the ability to grow to 32 nodes, and that would make pre-staging data a bit of a chore.

So, what is the good news? Well, you can still accomplish the pre-stage with only two data transfers, and it will not matter how many nodes the SV1 grows to. The trick is to pick the node whose /tmp you would like to start from and copy your files there. Then you can submit your job, which should begin by running a simple script that first tests its environment and then copies the /tmp environment from your node of choice to the node that has been selected for your job to run. This is only a minor delay since inter-node transfers are quickly performed. Your 500-MB data file should move in less than a minute (see statistics on a 49-MB file below).


Here  are  a  few  notes  concerning  this 
script: 

Note 1: You should have a file in your home directory called ".rhosts". This file should be amended to include all of the nodes with a "-hip0" extension. An example would be:

athena
athena-hip0
athena.navo.hpc.mil
athena-hip0.navo.hpc.mil
trident
trident-hip0
trident.navo.hpc.mil
trident-hip0.navo.hpc.mil
zeus
zeus-hip0
zeus.navo.hpc.mil
zeus-hip0.navo.hpc.mil
poseidon
poseidon-hip0
poseidon.navo.hpc.mil
poseidon-hip0.navo.hpc.mil

Obviously, this file should grow as new nodes are added...

Note 2: This script is written in C-shell, but it may be called by other types of shells. If you do not like C-shell, I am certain that a comparable script in Bourne, Korn, or your shell of choice could be written.

Note 3: This script may be run embedded in your QSUB job or as an executable from your home directory.

Note 4: Finally, this script is going to look for a small source code file and a 49-MB data file in /tmp/ray on the node Athena. The script will time the transfers (the example ran on Zeus), compile and run the code, and cat its contents. (This code shows the accumulated error caused by summing the same number set forward and then in reverse.)
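
The effect that the test code demonstrates can be sketched in a few lines. The original input.f is not reproduced in this article, so the Python example below is an illustration of the same idea rather than a translation of it. The function names and the sample values are assumptions, chosen only to make the rounding visible in double precision.

```python
def sum_forward(values):
    """Add the numbers in the order given."""
    total = 0.0
    for v in values:
        total += v
    return total


def sum_reverse(values):
    """Add the same numbers starting from the last one."""
    return sum_forward(reversed(values))


if __name__ == "__main__":
    # Near 1e16 adjacent doubles are 2 apart, so adding 1.0 to 1e16 is lost.
    numbers = [1.0e16, 1.0, 1.0]
    print("forward:", sum_forward(numbers))   # large value first
    print("reverse:", sum_reverse(numbers))   # small values first
    print("difference:", sum_reverse(numbers) - sum_forward(numbers))
```

Summing the small terms first lets them accumulate before they meet the large one, which is why the two orders disagree.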

The  Script  (called  news.host.csh) 

#!/bin/csh
set echo

setenv HOST `hostname`
echo $HOST

if ( $HOST != "athena" ) then
    if ( ! -d /tmp/ray/round_error ) then
        mkdir -p /tmp/ray/round_error
    endif
    cd /tmp/ray/round_error
    # Use the HiPPI connection for the fastest internal transfer speed.
    timex rcp athena-hip0:/tmp/ray/round_error/input.f .
    timex rcp athena-hip0:/tmp/ray/round_error/data.dat .
else
    echo "I do not need to move files, so do nothing here & go to work"
    cd /tmp/ray/round_error
endif
#
pwd
f90 -o test.job input.f
chmod 755 test.job
ls -l
ja
./test.job
ja -st
#
echo " End of Job "
#

Article  Continues  Page  22... 
A Look Inside NAVO

We  welcome  our  visitors... 


Left:

Simone Youngblood,

Defense Modeling and
Simulation Office’s
Verification, Validation, and
Accreditation Technical
Director, visit


Right: 

Captain  Frank  Garcia, 
Office  of  the  Secretary  of 
Defense  visit 


Left: 

Congressional 
Delegation  visit 


Right: 

National  Imagery  and 
Mapping  Agency 
Delegation  visit 


Right: 

Brigadier  General 
George  Cannellos, 
Adjutant  General 
for  Air,  State  of 
Alaska,  visit 


Above:

Commemorative Keepsake, Scientific Computing 2000
(SC2000). David Stinson, Charles Ray, and Dana
Allen, Engineer Research and Development Center; Pete
Grusinskas and Eleanor Schroeder, NAVO MSRC




Left: 

Captain  Grandau, 
Prospective 
Commanding  Officer, 
Naval  Pacific 
Meteorology  and 
Oceanography  Center, 
Pearl  Harbor,  visit 


Right:

Colonel  Robert  Allen, 
Air  Force  Weather 
Agency,  visit 


Right:

Terry’s Last Tour -
L-R:
Steve Adamec,
NAVO MSRC Director;
Dr. Donald Durham,
CNMOC
Technical Director;
Terry Blanchard,
NAVO MSRC
Deputy Director;
Landry Bernard,
NAVOCEANO
Technical Director


Left: 

Visit of Lieutenant
Governor Amy
Tuck, State of
Mississippi (far right)


Above:  Captain  Gunderson,  NAVSEA  visit 
Article Continued from Page 19
The  QSUB  Job  (called  news.script) 

#QSUB -s /bin/csh
# Specifies the shell to use
#QSUB -q batch
# Specifies the queue name
#QSUB -lT 330
# Specifies the per request CPU time limit in seconds
#QSUB -lt 300
# Specifies the per process CPU time limit in seconds
#QSUB -lM 100Mw
# Specifies the per request memory limit in megawords
#QSUB -lm 95Mw
# Specifies the per process memory limit in megawords
#QSUB -o /u/home/ray/news.job.output
# Directs stdout to the stated file
#QSUB -eo
# Merges stderr and stdout produced by the job
#
# Execute the script
#
/u/home/ray/news.host.csh


The  Submission 


athena% qsub ~ray/news.script
nqs-181 qsub: INFO
  Request <16648.athena>: Submitted to queue <nqenlb> by <ray(297)>.

athena% 


The  Output  (header  has  been  deleted) 


+ setenv HOST `hostname`
+ hostname
+ echo zeus
zeus
+ if ( zeus != athena ) then
+ if ( ! -d /tmp/ray/round_error ) then
+ mkdir -p /tmp/ray/round_error
+ endif
+ cd /tmp/ray/round_error
+ f90 -o test.job input.f
+ timex rcp athena-hip0:/tmp/ray/round_error/input.f .
+ chmod 755 test.job
+ ls -l
         seconds   "clocks"
total 98784
real     5.928195  (592819510)
user     0.007392  (739175)
-rw-r--r--  1 ray  root  49000000 Mar 27 15:10 data.dat
sys      0.065503  (6550348)
+ timex rcp athena-hip0:/tmp/ray/round_error/data.dat .
-rw-r--r--  1 ray  root       848 Mar 27 15:10 input.f
-rwxr-xr-x  1 ray  root   1542376 Mar 27 15:11 test.job
         seconds   "clocks"
+ ja
real     5.132697  (513269738)
+ ./test.job
user     0.012876  (1287588)
sys      0.752989  (75298936)
forward sum is = 485778955946.541
+ else
+ pwd
reverse sum is = 485778956355.709
/tmp/ray/round_error
+ ja -st

Job  Accounting  -  Summary  Report 


Job Accounting File Name:         /tmp/nqs.+++++2337/.jacct13117
Operating System:                 unicos zeus 10.0.0.7 roo.4 CRAY SV1
User Name (ID):                   ray (297)
Group Name (ID):                  usersup (139)
Account Name (ID):                NA0101 (50003)
Job Name (ID):                    news.script (13117)
Report Starts:                    03/27/01 15:11:04
Report Ends:                      03/27/01 15:11:42
Elapsed Time:                     38 Seconds
User CPU Time:                    36.7994 Seconds
System CPU Time:                  1.5433 Seconds
I/O Wait Time (Locked):           0.0254 Seconds
I/O Wait Time (Unlocked):         0.0223 Seconds
CPU Time Memory Integral:         53.4371 Mword-seconds
SDS Time Memory Integral:         0.0000 Mword-seconds
I/O Wait Time Memory Integral:    0.0352 Mword-seconds
Data Transferred:                 5.8413 MWords
Maximum memory used:              1.3945 MWords
Logical I/O Requests:             1499
Physical I/O Requests:            4
Number of Commands:               2
Billing Units:                    0.0000

+  echo  End  of  Job 

End  of  Job 

logout 

athena% 
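
The forward and reverse sums in the output above differ by about 409 even though, on paper, adding the same numbers in either order gives the same total. The Fortran source (input.f) is not reproduced in this listing, so the following is only a minimal Python sketch of the same effect. It assumes IEEE 754 double precision, and the function names and the three-element data set are hypothetical, chosen for illustration rather than taken from the article.

```python
# Illustrative sketch only: the article's Fortran program (input.f) is not
# shown, so the data and function names below are hypothetical.


def forward_sum(values):
    """Add the values first to last, accumulating in a running total."""
    total = 0.0
    for v in values:
        total += v
    return total


def reverse_sum(values):
    """Add the same values last to first, accumulating in a running total."""
    total = 0.0
    for v in reversed(values):
        total += v
    return total


if __name__ == "__main__":
    # Near 1e16 adjacent doubles are 2.0 apart, so adding 1.0 to 1e16 is
    # rounded away. Adding 1.0 after the two large terms cancel keeps it.
    values = [1.0, 1e16, -1e16]
    print("forward sum is =", forward_sum(values))
    print("reverse sum is =", reverse_sum(values))
```

With these three values the forward pass returns 0.0 and the reverse pass returns 1.0. The small term is lost whenever it is added to a running total that is already much larger, which is why the order of summation can change the result.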
Upcoming Events

August  2001 

HPDC - 10th International
Symposium on High-Performance
Distributed Computing
7-10 August • San Francisco, California
Ian Foster, itf@mcs.anl.gov

PDCS 2001 - 14th Annual
International Conference on Parallel
and Distributed Computing Systems
8-10 August • Dallas, Texas
Edwin Sha, edsha@utdallas.edu

DS-RT 2001 - 5th IEEE International
Workshop on Distributed Simulation
and Real-Time Applications
13-15 August • Cincinnati, Ohio
Mark Pullen, mpullen@gmu.edu
www.cs.unt.edu/~boukerch/DS-RT2001

13th International Conference on
Parallel and Distributed
Computing and Systems
21-24 August • Anaheim, California
Carrie Manchuck, calgary@iasted.com
www.iasted.com/conferences/2001/anaheim/pdcs.htm

September  2001 

PARCO2001 - Conference on
Parallel Computing
04-07 September • Naples, Italy
www.parco.org

November 2001

Beyond Boundaries - Scientific
Computing Conference
10-15 November 2001 • Denver, Colorado
www.sc2001.org


USERS GROUP
CONFERENCE

June 18-22

Invited Speakers:

Dr. Delores Etter - DDR&E
Mr. Phil Coyle - OT&E
Dr. Arthur Hopkins - DTRA
Dr. Robert Ballard - IFE
Dr. Jay Boris - NRL
Dr. Aiichiro Nakano - LSU
Naval Oceanographic Office • MAJOR SHARED RESOURCE CENTER

1002 Balch Boulevard • Stennis Space Center, Mississippi • 39522


